Abstract: As wind energy has become a major energy source, it has attracted growing research interest. Wind farm power curve monitoring and wind power prediction are core elements of integrated wind energy research. Because exact modeling of the wind resource is nearly impossible and wind turbines are nonlinear, data mining methods are preferred over analytical methods for extracting high-level information from the low-level data collected by data acquisition systems. The required inputs are wind power and wind speed magnitudes. From these inputs, raw wind data are classified as valid or invalid using an unsupervised algorithm. The data are sorted into six categories: valid, missing, constant, exceeding, irrational, and unnatural. Outlier detection is then used to filter the data. Variance and bias are taken into account when different approximate power curve models are used for data detection. The Local Outlier Factor is incorporated together with the similarity measures used. A weighted distance is calculated so that relevant data points do not go undetected.
Keywords: Data Mining, Wind Data Preprocessing, Wind Energy, Local Outlier Factor, Weighted Distance.
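
As a rough illustration of the outlier-detection step named in the abstract, the following sketch computes the standard Local Outlier Factor over (wind speed, power) pairs using a weighted Euclidean distance. The abstract does not specify the weighting scheme, the neighbourhood size, or any data, so the per-feature `weights`, the value `k=2`, the sigmoid-shaped power curve, the appended faulty reading, and the flagging threshold are all assumptions made here for illustration only. Ties in the k-distance neighbourhood are ignored for simplicity, and points are assumed to be distinct.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; the weights are a hypothetical choice."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def local_outlier_factor(points, k=2, weights=(1.0, 1.0)):
    """Return the LOF score of every point (values near 1 indicate inliers)."""
    n = len(points)
    dist = [[weighted_distance(p, q, weights) for q in points] for p in points]

    # k nearest neighbours of each point (itself excluded) and its k-distance.
    neighbours, k_dist = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_dist.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(k / sum(reach))

    # LOF: mean density of the neighbours relative to the point's own density.
    return [sum(lrd[j] for j in neighbours[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical data: wind speed in m/s, power in MW on a sigmoid-shaped curve.
points = [(5 + 0.5 * i, 2.0 / (1.0 + math.exp(-(5 + 0.5 * i - 8.0))))
          for i in range(15)]
points.append((11.0, 0.0))  # assumed faulty reading: high wind, zero output

scores = local_outlier_factor(points, k=2)
threshold = 2.0  # illustrative cut-off, not taken from the source
flagged = [i for i, s in enumerate(scores) if s > threshold]
print("flagged indices:", flagged)
```

In this sketch the points lying on the curve score close to 1, while the appended reading sits in a much sparser neighbourhood and scores well above the cut-off. Changing `weights` alters how strongly each variable influences the neighbourhood, which is one way a weighted distance could keep relevant points from being missed.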